Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/Opcode Circuits for DIV, REM, and REMU opcodes #596

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

bgillesp
Copy link
Collaborator

@bgillesp bgillesp commented Nov 18, 2024

This PR extends the existing DIVU opcode implementation to expose the remainder as an output, and to handle the (more complicated) signed case. See the comments in div.rs for details, but in summary:

  • The basic design of signed and unsigned DIV and REM opcode circuits is to represent the division algorithm relation dividend = quotient * divisor + remainder using native field elements, and constraining so that either 0 <= remainder < divisor or divisor == 0 holds.
  • Some nontrivial steps are needed in the signed case to handle the additional complexity of different signs for the divisor and remainder values, as well as to handle signed division overflow where -2^31 / -1 = 2^31 is out of the representable range for i32 values.
  • In either case, either the quotient or the remainder value is exposed as the output register rd.
  • In theory with the current implementation, consecutive calls to DIV and REM or DIVU and REMU could be batched into a single circuit since both of the outputs are represented; however, this is not yet implemented.

Currently the circuit implementation and documentation are finished and ready for review. Unit tests still need to be finished, but it would be great to get some eyes on the circuit design now.

This was linked to issues Nov 19, 2024
@mcalancea mcalancea self-assigned this Dec 6, 2024
@kunxian-xia kunxian-xia requested a review from naure December 10, 2024 02:17
@kunxian-xia
Copy link
Collaborator

@bgillesp when will this pr be ready for review?

@bgillesp bgillesp changed the title [WIP] Feat/Opcode Circuits for DIV, REM, and REMU opcodes Feat/Opcode Circuits for DIV, REM, and REMU opcodes Dec 10, 2024
@bgillesp
Copy link
Collaborator Author

Unit tests still need to be added for the new opcode circuits, but everything else is in shape for an initial review -- will change the PR out of Draft now.

@bgillesp bgillesp marked this pull request as ready for review December 10, 2024 16:25
matthiasgoergens added a commit that referenced this pull request Dec 11, 2024
@@ -18,6 +18,7 @@ use crate::{

use super::SignedExtendConfig;

// TODO rename to AssertLtConfig (LT -> Lt) to fit naming conventions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #731

matthiasgoergens added a commit that referenced this pull request Dec 11, 2024
To help make #596 easier to
read and reason about.
matthiasgoergens added a commit that referenced this pull request Dec 11, 2024
To make the diff for #596 easier
to read.
Copy link
Collaborator

@matthiasgoergens matthiasgoergens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments and suggestions so far.

ceno_zkvm/src/gadgets/signed.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/gadgets/signed.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/gadgets/signed.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/gadgets/signed_ext.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please merge #733 first, so that we can read the diff here better. Thanks!

You can solve any conflicts by just using your version of the code, because that's what we are trying to do here. You can either do that manually or via something like git merge --strategy recursive --strategy-option=ours origin/master

See https://github.com/scroll-tech/ceno/pull/734/files#diff-44194ded9672e9ed47ef78c46573cc0860f9317811d6be41b259afb554a989b4 for how that minimises and simplifies the diff.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#733 is merged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished merging in master, hopefully the diff is a little nicer now.

kunxian-xia pushed a commit that referenced this pull request Dec 11, 2024
To make the diff for #596 easier
to read.
@matthiasgoergens
Copy link
Collaborator

@bgillesp I made a version of this PR that has an easier to read diff and left some comments there. They apply here as well. See #734

@hero78119
Copy link
Collaborator

hero78119 commented Dec 11, 2024

Attach DIVU circuit statistics 🔥

First row master branch, second row this PR

+---------------+---------------+---------+-------+-----------+--------+------------+---------------------+
| opcode_name   | num_instances | lookups | reads | witnesses | writes | 0_expr_deg | 0_expr_sumcheck_deg |
+---------------+---------------+---------+-------+-----------+--------+------------+---------------------+
| DIVU          | 0             | 21      | 4     | 37        | 4      | [1: 7]     | [2: 8]              | 
| DIVU          | 0             | 13      | 4     | 29        | 4      | [1: 5]     | [2: 5]              | 

@matthiasgoergens
Copy link
Collaborator

matthiasgoergens commented Dec 11, 2024

Attach DIVU circuit statistics 🔥

First row master branch, second row this PR

+---------------+---------------+---------+-------+-----------+--------+------------+---------------------+
| opcode_name   | num_instances | lookups | reads | witnesses | writes | 0_expr_deg | 0_expr_sumcheck_deg |
+---------------+---------------+---------+-------+-----------+--------+------------+---------------------+
| DIVU          | 0             | 21      | 4     | 37        | 4      | [1: 7]     | [2: 8]              | 
| DIVU          | 0             | 13      | 4     | 29        | 4      | [1: 5]     | [2: 5]              | 

Keep in mind that the PR has at least one bug (and that might affect the stats)?

ceno_zkvm/src/instructions/riscv/div.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/instructions/riscv/div.rs Outdated Show resolved Hide resolved
&self,
instance: &mut [MaybeUninit<E::BaseField>],
lkm: &mut LkMultiplicity,
val: &Value<u32>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be better to use i32? Callers could assign i32::MAX + 1, and this could cause unexpeced result If the type isu32.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a nontrivial change to implement -- the Value type seems to lean into the design of an "unsigned integer register" which can be interpreted as signed as desired, but not by default. In particular, the generic type T for Value<T> is currently constrained to Into<u64> + From<u32>, so u32 and u64 values are first-class citizens, but things break when I try to make it an i32. The gadget is really designed to provide a signed interpretation to an unsigned register value anyway, so I think that's why I originally chose the u32 representation. Okay if we leave it as is for now?

ceno_zkvm/src/instructions/riscv/div.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/instructions/riscv/div.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@hero78119 hero78119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epic job! thanks for the implementation and clear documentation.
Just do a quick overview and will proceed arithmetics constrain review 2nd round

ceno_zkvm/src/gadgets/signed_ext.rs Show resolved Hide resolved
ceno_zkvm/src/gadgets/signed.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/instructions/riscv/div.rs Outdated Show resolved Hide resolved
Bryan Gillespie added 15 commits December 11, 2024 07:39
…opcodes

The current implementation tries to be generic over different UInt widths, but
the circuit design soundness relies on specifically using 32-bit registers in
the Goldilocks field, so it doesn't make sense to sacrifice readability for
unused generality here.
…ded value

The current assumption of the circuit design is that n_bits has to be 8 or 16,
so this change will better flag violations of this assumption if we end up
changing the base limb size to 32 bits.
@bgillesp
Copy link
Collaborator Author

Thank you for the generous feedback everyone! 🙏 I think I've addressed your comments as much as I can currently -- will keep an eye on this space to help move things forward as the review progresses.

matthiasgoergens added a commit that referenced this pull request Dec 12, 2024
To help make #596 easier to read
and reason about.

Also introduce a few more conversion helpers.
let neg_div_expr = {
let a_neg = dividend_signed.is_negative.expr();
let b_neg = divisor_signed.is_negative.expr();
&a_neg * (1 - &b_neg) + (1 - &a_neg) * &b_neg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really minor one, I like the original style more because It's clear and slightly efficient. It reduces the number of recursion here. But it depends on you :)

// a_neg * (1 - b_neg) + (1 - a_neg) * b_neg
(a_neg.clone() + b_neg.clone()) - (Expression::<E>::from(2) * a_neg * b_neg)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the fact that it was simplified to monomial form was why I went with the original version, but then the current revision is closer to the underlying logic of the expression, which gives an easily readable meaning for binary inputs. I don't think I have a strong opinion about it -- any thoughts @matthiasgoergens since the change was your suggestion?

Copy link
Collaborator

@matthiasgoergens matthiasgoergens Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two parts to my suggestion. One is to remove the manual clones, and the other is to use a simpler algebraic expression.

They are independent of each other.

I don't really have much of an opinion on whether (a + b - 2 * a * b) or a * (1 - b) + (1 - a) * b is easier to read. I went with the latter, because someone had made a comment // a_neg * (1 - b_neg) + (1 - a_neg) * b_neg to that end; so I assume someone thought that form is easier to read.

Here's my suggestion, if you want to use the original expression:

// a_neg xor b_neg
&a_neg + &b_neg - 2 * a_neg * b_neg

Btw, when you say 'efficient', do you mean that the distribution function will be called slightly less often; or do you mean that the final resulting expression will be different?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this seems also like a good way to write it, with the simple comment -- updated to this version in the latest commit.


Ok(())
}
}

// TODO Tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you forgot this TODO. I didn't see any testing added.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not forgotten! I mentioned that this is still TODO in the PR description -- another member of our team is working on it, but paused yesterday due to the rapid changes from the review. Should be finished relatively soon, but probably an initial review of the circuit design would still be valuable at this point before we validate with unit tests.

matthiasgoergens added a commit that referenced this pull request Dec 12, 2024
Implementing a TODO from #596
10to4 pushed a commit to 10to4/ceno that referenced this pull request Dec 12, 2024
To make the diff for scroll-tech#596 easier
to read.
10to4 pushed a commit to 10to4/ceno that referenced this pull request Dec 12, 2024
To help make scroll-tech#596 easier to read
and reason about.

Also introduce a few more conversion helpers.
10to4 pushed a commit to 10to4/ceno that referenced this pull request Dec 12, 2024
@bgillesp
Copy link
Collaborator Author

Note that while commit e719763 is showing failed tests, the cargo make tests command runs successfully on my local build, so I'm guessing it might be something stalled on the CI end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remu rem div
6 participants